
Improve the initializer Interface for fc, sequence_conv and conv2d layers #5760


Merged

merged 3 commits on Nov 20, 2017

Conversation

abhinavarora (Contributor) commented Nov 18, 2017

This PR implements the following:

  1. Implements a new initializer interface for the fc, sequence_conv and conv2d layers.
  2. Enforces a default initializer for any parameter created through layer_helper.
  3. The default initializer depends on the dtype of the parameter: for float types (FP16, FP32, FP64) it is the uniform Xavier initializer, and for integer types it is the zeros initializer (see the sketch below).
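
The rule in item 3 can be sketched as follows. This is a minimal, self-contained illustration rather than the PR's actual layer_helper code; the class and function names below are stand-ins for Paddle's uniform Xavier and zeros initializers.

import numpy as np


class UniformXavierInitializer:
    """Stand-in for the uniform Xavier initializer used for float parameters."""


class ZerosInitializer:
    """Stand-in for the zeros initializer used for int/bool parameters."""


def default_initializer_for(dtype):
    # Float parameters (FP16/FP32/FP64) default to uniform Xavier;
    # everything else (int, bool) defaults to zeros.
    if np.issubdtype(np.dtype(dtype), np.floating):
        return UniformXavierInitializer()
    return ZerosInitializer()


if __name__ == "__main__":
    assert isinstance(default_initializer_for(np.float32), UniformXavierInitializer)
    assert isinstance(default_initializer_for(np.int64), ZerosInitializer)
    assert isinstance(default_initializer_for(np.bool_), ZerosInitializer)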

@@ -15,6 +15,37 @@ def unique_name(prefix):
return "_".join([prefix, str(uid)])


def convert_np_dtype_to_dtype_(np_dtype):
Contributor

Why make the type conversion a global function? I think a staticmethod is more appropriate here, because the type conversion function should not need to be called outside of Variable.
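
For illustration, the staticmethod alternative suggested here might look roughly like the following. This is only a sketch: the method body is elided, and placing it on Variable is the reviewer's suggestion, not the PR's code.

class Variable:
    # Hypothetical sketch: keep the conversion private to Variable
    # instead of exposing it as a module-level function.
    @staticmethod
    def _convert_np_dtype_to_dtype_(np_dtype):
        ...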

Contributor Author

The reason this was made a global function is that it is needed outside of the Variable class. In layer_helper, we want to make sure that every parameter that has not been supplied an initializer gets a default one. This default depends on the dtype of the parameter: if the parameter is a float type, XavierInitializer is used; otherwise, for int and bool types, the parameter is initialized with zeros.

We need this method outside the class because users can also pass NumPy datatypes as dtypes. The initializer has to be chosen in layer_helper, so we need to check whether the supplied datatype (which could be an np dtype or a core.DataType) is a float type. Do you have any suggestion on how to accomplish this without making it global?
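
A rough, self-contained sketch of the kind of check described above, with simplified stand-ins for core.DataType and convert_np_dtype_to_dtype_. The dtype_is_floating name comes from the later discussion, but the body here is an assumption, not the PR's verbatim code.

from enum import Enum

import numpy as np


class DataType(Enum):
    # Simplified stand-in for core.DataType.
    BOOL = 0
    INT32 = 1
    INT64 = 2
    FP16 = 3
    FP32 = 4
    FP64 = 5


_NP_TO_DATATYPE = {
    np.dtype("bool"): DataType.BOOL,
    np.dtype("int32"): DataType.INT32,
    np.dtype("int64"): DataType.INT64,
    np.dtype("float16"): DataType.FP16,
    np.dtype("float32"): DataType.FP32,
    np.dtype("float64"): DataType.FP64,
}


def convert_np_dtype_to_dtype_(np_dtype):
    # Map a NumPy dtype (or anything np.dtype() accepts) to the framework enum.
    return _NP_TO_DATATYPE[np.dtype(np_dtype)]


def dtype_is_floating(dtype):
    # Normalize to the framework enum first, then test for the float widths.
    if not isinstance(dtype, DataType):
        dtype = convert_np_dtype_to_dtype_(dtype)
    return dtype in (DataType.FP16, DataType.FP32, DataType.FP64)


if __name__ == "__main__":
    assert dtype_is_floating(np.float32)
    assert dtype_is_floating(DataType.FP64)
    assert not dtype_is_floating(np.int64)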

Contributor

I found a solution that resolves this problem. convert_np_dtype_to_dtype_ goes the wrong way: this function just lets the user configure a data type as a string such as float32 or float64. Instead, we should only let the user configure supported data types such as paddle.float32 and paddle.float64, and make the real type conversion (from/to NumPy) happen in the feed/fetch implementation.
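
A rough illustration of that alternative: expose a fixed set of framework-level dtype constants (imagined here as paddle.float32-style values) and confine NumPy conversion to the feed boundary. All names below are hypothetical sketches, not Paddle's actual API.

import numpy as np

# Hypothetical framework-level dtype constants the user would configure with,
# standing in for the paddle.float32 / paddle.float64 suggested above.
FLOAT32 = "paddle.float32"
FLOAT64 = "paddle.float64"

# The only place a NumPy conversion happens is the feed boundary.
_TO_NUMPY = {FLOAT32: np.float32, FLOAT64: np.float64}


def feed(array, expected_dtype):
    # Convert user data to the NumPy dtype the graph expects at feed time.
    return np.asarray(array, dtype=_TO_NUMPY[expected_dtype])


if __name__ == "__main__":
    x = feed([[1, 2], [3, 4]], FLOAT32)
    assert x.dtype == np.float32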

if not isinstance(dtype, core.DataType):
    dtype = convert_np_dtype_to_dtype_(dtype)

if (dtype == core.DataType.FP16 or dtype == core.DataType.FP16 or
Contributor

The second dtype == core.DataType.FP16 should be FP32 here. But I think we need more general type assertions on the C++ side that just throw an exception when the configuration is wrong.

Contributor Author

Same as above.

dzhwinter previously approved these changes Nov 20, 2017

dzhwinter (Contributor) left a comment

@dzhwinter is rewriting the Python type handling and will remove the dtype_is_floating check there. The other parts look good to me.

dzhwinter (Contributor) left a comment

Re-approved. LGTM.
